5 research outputs found

    A 249-Mpixel/s HEVC Video-Decoder Chip for 4K Ultra-HD Applications

    High Efficiency Video Coding, the latest video standard, uses larger and variable-sized coding units and longer interpolation filters than H.264/AVC to better exploit redundancy in video signals. These algorithmic techniques enable a 50% decrease in bitrate at the cost of computational complexity, external memory bandwidth, and, for ASIC implementations, on-chip SRAM of the video codec. This paper describes architectural optimizations for an HEVC video decoder chip. The chip uses a two-stage subpipelining scheme to reduce on-chip SRAM by 56 kbytes, a 32% reduction. A high-throughput read-only cache combined with DRAM-latency-aware memory mapping reduces DRAM bandwidth by 67%. The chip is built for HEVC Working Draft 4 Low Complexity configuration and occupies 1.77 mm² in 40-nm CMOS. It performs 4K Ultra HD 30-fps video decoding at 200 MHz while consuming 1.19 nJ/pixel of normalized system power. Texas Instruments Incorporate
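
The headline figures in this abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything taken from the paper: it assumes that "4K Ultra HD" means a 3840×2160 frame, that the normalized system power is simply energy per pixel times pixel rate, and that the 32% SRAM saving is measured against the size before the optimization. All variable names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumption: "4K Ultra HD" is a 3840x2160 frame (not stated in this abstract).
WIDTH, HEIGHT = 3840, 2160
FPS = 30                      # frames per second, from the abstract
CLOCK_HZ = 200e6              # core clock, from the abstract
ENERGY_NJ_PER_PIXEL = 1.19    # normalized system power, from the abstract
SRAM_SAVED_KB = 56            # on-chip SRAM saved, from the abstract
SRAM_SAVED_FRACTION = 0.32    # the same saving expressed as a fraction

# Decoded pixels per second; this is where the "249 Mpixel/s" title comes from.
pixel_rate = WIDTH * HEIGHT * FPS

# Clock cycles available per decoded pixel at the stated frequency.
cycles_per_pixel = CLOCK_HZ / pixel_rate

# Assumption: system power = energy per pixel * pixel rate, reported in mW.
system_power_mw = ENERGY_NJ_PER_PIXEL * 1e-9 * pixel_rate * 1e3

# Assumption: the 32% is relative to the SRAM size before the optimization.
sram_before_kb = SRAM_SAVED_KB / SRAM_SAVED_FRACTION

print(f"pixel rate       : {pixel_rate / 1e6:.1f} Mpixel/s")
print(f"cycles per pixel : {cycles_per_pixel:.3f}")
print(f"system power     : {system_power_mw:.1f} mW")
print(f"SRAM before      : {sram_before_kb:.0f} kbytes")
```

Under these assumptions the decoder has less than one clock cycle per output pixel, which suggests why the abstract stresses pipelining and a high-throughput cache.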

    A 249Mpixel/s HEVC video-decoder chip for Quad Full HD applications

    The latest video coding standard, High Efficiency Video Coding (HEVC), provides a 50% improvement in coding efficiency compared to H.264/AVC, to meet the rising demand for video streaming, better video quality and higher resolutions. The coding gain is achieved using more complex tools such as larger and variable-size coding units (CU) in a hierarchical structure, larger transforms and longer interpolation filters. This paper presents an integrated circuit which supports Quad Full HD (QFHD, 3840×2160) video decoding for the HEVC draft standard. It addresses new design challenges for HEVC (“H.265”) with three primary contributions: 1) a system pipelining scheme which adapts to the variable-size largest coding unit (LCU) and provides a two-stage sub-pipeline for memory optimization; 2) unified processing engines to address the hierarchical coding structure and many prediction and transform block sizes in area-efficient ways; 3) a motion compensation (MC) cache which reduces DRAM bandwidth for the LCU and meets the high throughput requirements which are due to the long filters. Texas Instruments Incorporate

    Algorithms, architectures and circuits for low power HEVC codecs

    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014. Cataloged from PDF version of thesis. Includes bibliographical references (pages 81-84). In order to satisfy the demand for high quality video streaming, aggressive compression is necessary. High Efficiency Video Coding (HEVC) is a new standard that has been designed with the goal of satisfying this need in the coming decade. For a given video quality, HEVC offers 2x better compression than existing standards. However, this compression comes at the cost of a commensurate increase in complexity. Our work aims to control this complexity in the context of real-time hardware video codecs. Our work focused on two specific areas: Motion Compensation Bandwidth and Intra Estimation. HEVC uses larger filters for motion compensation, leading to a significant increase in decoder bandwidth. We present a novel motion compensation cache that reduces external memory bandwidth by 67% and power by 40%. The use of large, variable-sized coding units and new prediction modes results in a dramatic increase in the search space of a video encoder. We present novel intra estimation algorithms that substantially reduce encoder complexity with a modest 6% increase in BD-rate. These algorithms are co-designed with the hardware architecture, allowing us to implement them within reasonable hardware constraints. by Chiraag Juvekar. S.M.
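
The link between filter length and decoder bandwidth can be illustrated with a small calculation. This is a rough sketch rather than the thesis's own model: it assumes an 8-tap luma interpolation filter for HEVC and a 6-tap one for H.264/AVC (tap counts are not given in the abstract), a motion vector that is fractional in both dimensions, and no reuse of reference pixels between neighbouring blocks. The function names are hypothetical.

```python
def reference_pixels(block: int, taps: int) -> int:
    """Reference pixels fetched to interpolate one block x block prediction.

    A separable `taps`-tap filter needs taps - 1 extra rows and columns
    around the block when the motion vector is fractional in both axes.
    """
    side = block + taps - 1
    return side * side


def fetch_overhead(block: int, taps: int) -> float:
    """Ratio of fetched reference pixels to predicted pixels."""
    return reference_pixels(block, taps) / (block * block)


if __name__ == "__main__":
    # Assumed tap counts: 8 for HEVC luma, 6 for H.264/AVC luma.
    for block in (4, 8, 16, 32, 64):
        hevc = fetch_overhead(block, taps=8)
        avc = fetch_overhead(block, taps=6)
        print(f"{block:2d}x{block:<2d}  8-tap: {hevc:5.2f}x   6-tap: {avc:5.2f}x")
```

Small blocks pay the largest overhead because the filter margin is fixed while the block area shrinks. Overlapping fetches between adjacent blocks are the kind of redundancy a motion compensation cache can exploit.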

    Hardware and protocols for authentication and secure computation

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 149-162). The Internet of Things has resulted in an exponential rise in the number of embedded electronic devices. This thesis deals with ensuring the security of these embedded devices. In particular we focus our attention on two problems: first, we look at how these devices can convince one another of their identity, i.e. authentication, and second, we look at how these devices and cloud servers can compute joint functions of their private inputs while revealing nothing but the computation results to the other, i.e. secure computation. We start with the problem of counterfeit detection through electronic tagging. Physical access to electronic tags can be leveraged to mount side-channel and fault injection attacks. We design a new tagging solution that leverages ferro-electric capacitor based non-volatile memory to address these issues. Next we note that resource constraints imposed by embedded devices often preclude the use of public-key cryptography. We address this issue through the development of a lightweight (10k-Gate) Elliptic Curve accelerator for the K-163 curves, which allows us to build a secure wireless-charging system that can block power from counterfeit and potentially dangerous chargers. Next we build upon these insights to develop a new authentication protocol which combines the leakage resilience and public-key authentication properties of our previous tagging solutions. We implement this bilinear pairing based protocol on a RISC-V processor and demonstrate its practicality in an embedded environment through reuse of existing hardware accelerated cryptography for the TLS protocol. The final part of this thesis develops a framework for secure two-party computation. Our primary contribution is a judicious combination of homomorphic encryption and garbled circuits to substantially improve the performance of secure two-party computation. This allows us to present a practical solution to the problem of secure neural network inference, i.e. classifying your private data against a server's private model without either party sharing their data with the other. Our hybrid approach improves upon the state of the art by 20-30x in classification latency. Our final contributions are two efficient 2PC protocols that implement secure matrix multiplication and vector-OLE primitives. For both these tasks we improve concrete computation and communication performance over the state of the art by an order of magnitude. by Chiraag Juvekar. Ph. D.

    A Keccak-Based Wireless Authentication Tag with per-Query Key Update and Power-Glitch Attack Countermeasures

    Counterfeiting is a major problem plaguing global supply chains. While small low-cost tagging solutions for supply-chain management exist, security in the face of fault-injection [1] and side-channel attacks [2] remains a concern. Power glitch attacks [3] in particular attempt to leak key-bits by inducing fault conditions during cryptographic operation through the use of over-voltage and under-voltage conditions. This paper presents the design of a secure authentication tag with wireless power and data delivery optimized for compact size and near-field applications. Power-glitch attacks are mitigated through state backup on FeRAM based non-volatile flip-flops (NVDFFs) [4]. The tag uses Keccak [5] (the cryptographic core of SHA-3) to update the key before each protocol invocation, limiting side-channel leakage to a single trace per key. Fig. 1 shows the complete system including the tag, reader, and backend server implemented in this work. Tags are seeded at manufacture and this initial seed is stored in the server database before a tag is affixed to an item. A wireless power and data transfer (WPDT) frontend harvests energy from the reader (433 MHz inductive link) and powers the on-chip authentication engine (AE). On startup the AE updates its key using a PRNG (seeded with the old key) and increments the key index. The AE then responds to the subsequent challenge by encrypting the challenge under the new key. These challenge-response pairs can be validated by a trusted server to authenticate the tag. Additionally, the server can use the key index to resynchronize with the tag in the event of packet loss. Denso (Firm); Texas Instruments Incorporate
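
The per-query key update and index-based resynchronization described in this abstract can be sketched in software. This is a minimal illustration under stated assumptions, not the chip's actual construction: SHA3-256 from Python's `hashlib` stands in for the Keccak-based PRNG, a keyed hash stands in for "encrypting the challenge", and the class names, domain-separation labels and `max_skip` bound are all hypothetical.

```python
import hashlib
import hmac


def next_key(key: bytes) -> bytes:
    # Stand-in for the tag's PRNG seeded with the old key (assumption: SHA3-256).
    return hashlib.sha3_256(b"key-update" + key).digest()


def respond(key: bytes, challenge: bytes) -> bytes:
    # Stand-in for encrypting the challenge under the current key (assumption).
    return hashlib.sha3_256(b"response" + key + challenge).digest()


class Tag:
    def __init__(self, seed: bytes) -> None:
        self.key = seed
        self.index = 0

    def power_up_and_answer(self, challenge: bytes) -> tuple[int, bytes]:
        # On startup: update the key first, bump the key index, then answer.
        self.key = next_key(self.key)
        self.index += 1
        return self.index, respond(self.key, challenge)


class Server:
    def __init__(self, seed: bytes, max_skip: int = 16) -> None:
        self.key = seed
        self.index = 0
        self.max_skip = max_skip  # hypothetical bound on tolerated lost queries

    def verify(self, tag_index: int, challenge: bytes, response: bytes) -> bool:
        gap = tag_index - self.index
        if gap < 1 or gap > self.max_skip:
            return False  # stale (replayed) index, or too far ahead
        # Resynchronize: step the stored key forward to the tag's index.
        key = self.key
        for _ in range(gap):
            key = next_key(key)
        if not hmac.compare_digest(respond(key, challenge), response):
            return False
        self.key, self.index = key, tag_index  # commit only on success
        return True


if __name__ == "__main__":
    seed = bytes(32)  # placeholder seed; a real tag would get a random one
    tag, server = Tag(seed), Server(seed)
    tag.power_up_and_answer(b"lost")          # this response never arrives
    idx, resp = tag.power_up_and_answer(b"challenge-2")
    print(server.verify(idx, b"challenge-2", resp))
```

Because the key changes on every power-up, each key is used for a single response, which matches the abstract's stated goal of limiting side-channel leakage to one trace per key. The server commits its stepped-forward state only after a successful check, so a forged index cannot desynchronize it.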